This paper defines the syntax and semantics of GP 2, a revised version of the graph programming
language GP. New concepts are illustrated and explained with example programs. Changes to the first 
version of GP include an improved type system for labels, a built-in marking mechanism for nodes 
and edges, a more powerful edge predicate for conditional rule schemata, and functions returning 
the indegree and outdegree of matched nodes. Moreover, the semantics of the branching and loop 
statement have been simplified to allow their efficient implementation. 

1 Introduction 

GP is an experimental nondeterministic programming language for high-level problem solving in the 
domain of graphs. The language is based on conditional rule schemata for graph transformation and has 
a simple syntax and semantics, to facilitate both understanding by programmers and formal reasoning on 
programs. The original version of GP (also referred to as GP 1 from now on) is defined in [12, 18] and its
prototype implementation is described in [14].

Motivated by case studies in GP programming, the following changes and extensions feature in GP 2: 

• There are new types atom and list, the former representing the union of integers and character 
strings, the latter lists of atoms. Variables of these types can be declared in rule schemata. 

• Rule schemata can mark nodes and edges graphically. 

• Conditional rule schemata can check, by means of the edge predicate, whether there exists an edge 
with a particular label between two matched nodes. 

• The indegree or outdegree of a matched node can be accessed and used in the labels or in the 
condition of a rule schema. 

• The if-then-else statement of GP 1 is complemented by a try-then-else command whose then-part 
is executed on the graph resulting from the try-part. Also, a new or command provides explicit 
nondeterministic choice between subprograms. 

• Failure in evaluating the condition of a branching statement or the body of a loop no longer
enforces backtracking, in order to allow an efficient implementation of branching and looping.

The rest of this paper is organised as follows. In Section 2 the graph transformation approach
underlying GP is briefly reviewed, viz. the double-pushout approach with relabelling. Section 3 introduces
conditional rule schemata, the building blocks of GP programs. The semantics of conditional rule
schemata is defined in Section 4. In Section 5 new features of GP 2 are demonstrated and explained
by example programs. A formal operational semantics for GP 2 is presented and discussed in Section 6.
Section 7 concludes by summarising GP's revision and addressing topics for future work. The Appendix
lists the inference rules of the semantics of Section 6.
2 Graphs and Graph Transformation

Graph transformation in GP is based on the double-pushout approach with relabelling. This framework
deals with partially labelled graphs whose definition is recalled below. In this section, we treat the
label alphabet as a parameter because in subsequent sections we need different alphabets: graphs in rule 
schemata are labelled with expressions while graphs on which GP programs operate (also referred to as 
host graphs) are labelled with lists composed of integers and strings. 

A graph over a label alphabet $\mathcal{C}$ is a system $G = (V_G, E_G, s_G, t_G, l_G, m_G)$, where $V_G$ and $E_G$ are finite
sets of nodes (or vertices) and edges, $s_G, t_G\colon E_G \to V_G$ are the source and target functions for edges,
$l_G\colon V_G \to \mathcal{C}$ is the partial node labelling function and $m_G\colon E_G \to \mathcal{C}$ is the (total) edge labelling function.
Given a node $v$, we write $l_G(v) = \bot$ to express that $l_G(v)$ is undefined. Graph $G$ is totally labelled if $l_G$ is
a total function. We write $\mathcal{G}(\mathcal{C}_\bot)$ and $\mathcal{G}(\mathcal{C})$ for the class of graphs resp. totally labelled graphs over $\mathcal{C}$.

Unlabelled nodes will occur only in the interfaces of rules and are necessary in the double-pushout 
approach to relabel nodes. There is no need to relabel edges as they can always be deleted and reinserted 
with different labels. 

A graph morphism $g\colon G \to H$ between graphs $G, H$ in $\mathcal{G}(\mathcal{C}_\bot)$ consists of two functions $g_V\colon V_G \to V_H$
and $g_E\colon E_G \to E_H$ that preserve sources, targets and labels; that is, $s_H \circ g_E = g_V \circ s_G$, $t_H \circ g_E = g_V \circ t_G$,
$m_H \circ g_E = m_G$, and $l_H(g(v)) = l_G(v)$ for all $v$ such that $l_G(v) \neq \bot$. Morphism $g$ is an inclusion if $g(x) = x$
for all nodes and edges $x$. It is injective (surjective) if $g_V$ and $g_E$ are injective (surjective). It is an
isomorphism if it is injective, surjective and satisfies $l_H(g_V(v)) = \bot$ for all nodes $v$ with $l_G(v) = \bot$. In
this case $G$ and $H$ are isomorphic, which is denoted by $G \cong H$.

A rule $r = (L \leftarrow K \rightarrow R)$ consists of two inclusions $K \to L$ and $K \to R$ such that $L, R$ are graphs
in $\mathcal{G}(\mathcal{C})$ and $K$, the interface of $r$, is a graph in $\mathcal{G}(\mathcal{C}_\bot)$. Intuitively, an application of $r$ to a graph will
remove the items in $L - K$, preserve $K$, add the items in $R - K$, and relabel the unlabelled nodes in $K$.

Definition 1 (Rule application). Let $r = (L \leftarrow K \rightarrow R)$ be a rule, $G$ a graph in $\mathcal{G}(\mathcal{C})$ and $g\colon L \to G$
an injective graph morphism satisfying the dangling condition: no node in $g(L) - g(K)$ is incident to
an edge in $G - g(L)$. We write $G \Rightarrow_{r,g} H$ if $H$ is isomorphic to the graph that is constructed from $G$ as
follows:

1. Remove all nodes and edges in g{L) — g{K), obtaining a graph D. 

2. Add disjointly to D all nodes and edges from $R - K$, keeping their labels. For $e \in E_R - E_K$, $s_H(e)$
is $s_R(e)$ if $s_R(e) \in V_R - V_K$, otherwise $g_V(s_R(e))$. Targets are defined analogously.

3. For each unlabelled node $v$ in $K$, $l_H(g_V(v))$ becomes $l_R(v)$.
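The three construction steps can be sketched in Python. This is a minimal illustration under stated assumptions, not part of GP: graphs are plain dictionaries, the interface K is a set of unlabelled node identifiers shared by L and R (so every edge of L is deleted), and the names `apply_rule` and `fresh` are hypothetical.

```python
def fresh(taken, base):
    """Return an identifier of the form (base, i) that does not occur in `taken`."""
    i = 0
    while (base, i) in taken:
        i += 1
    return (base, i)


def apply_rule(G, L, R, K, g_nodes, g_edges):
    """Apply the rule (L <- K -> R) to G at the injective match (g_nodes, g_edges).

    A graph is {"nodes": {id: label}, "edges": {id: (source, target, label)}}.
    K is the set of interface node identifiers (assumption: unlabelled nodes only).
    Returns the derived graph, or None if the dangling condition is violated.
    """
    deleted = {g_nodes[v] for v in L["nodes"] if v not in K}
    matched = set(g_edges.values())
    # Dangling condition: no node in g(L) - g(K) is incident to an edge in G - g(L).
    for e, (s, t, _) in G["edges"].items():
        if e not in matched and (s in deleted or t in deleted):
            return None
    # Step 1: remove g(L) - g(K), obtaining D.
    nodes = {v: lab for v, lab in G["nodes"].items() if v not in deleted}
    edges = {e: x for e, x in G["edges"].items() if e not in matched}
    # Step 2: add the nodes and edges of R - K under fresh identifiers.
    image = {v: g_nodes[v] for v in K}
    for v, lab in R["nodes"].items():
        if v not in K:
            image[v] = fresh(nodes, v)
            nodes[image[v]] = lab
    for e, (s, t, lab) in R["edges"].items():
        edges[fresh(edges, e)] = (image[s], image[t], lab)
    # Step 3: the image of each interface node receives its label from R.
    for v in K:
        nodes[image[v]] = R["nodes"][v]
    return {"nodes": nodes, "edges": edges}


# Hypothetical example: relabel node 1, delete the edge, add a node and an edge.
G = {"nodes": {"u": "a", "v": "b"}, "edges": {"e": ("u", "v", "x")}}
L = {"nodes": {1: "a", 2: "b"}, "edges": {"l": (1, 2, "x")}}
R = {"nodes": {1: "c", 2: "b", 3: "d"}, "edges": {"r": (1, 3, "y")}}
H = apply_rule(G, L, R, {1, 2}, {1: "u", 2: "v"}, {"l": "e"})
```

The sketch takes the match as given; finding it is a separate search problem.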

Figure 1 shows an example of a rule application. The rule in the upper row is applied to the left
graph of the lower row, resulting in the right graph of the lower row. (For simplicity, we assume that
all edge labels are the same and hence omit them.) The node identifiers 1 and 2 in the rule specify the
inclusions of the interface. The middle graph of the lower row is obtained from graph D of Definition 1
by making all nodes unlabelled that are images of unlabelled nodes in K. Then the diagram represents a
double-pushout in the category of graphs over $\mathcal{C}$ (see [3]).

3 Syntax of Rule Schemata 

Conditional rule schemata are the principal programming construct in GP. Figure 2 shows an (artificial)
example for the declaration of a rule schema containing some of the new features of GP 2. 
Figure 1: A rule application



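bridge(s,t: string; a: atom; n: int; x,y: list)

[Left and right graph of the rule schema, both with node identifiers 1, 2 and 3; the drawing is not recoverable from the extracted text.]

where (a = 0 or a = "?") and not edge(1, 3, s.t) and outdeg(1) = indeg(3)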



Figure 2: Declaration of a conditional rule schema 

Besides the types int and string of GP 1, there are the new types atom and list. Type atom is the 
union of int and string, and list is the type of a (possibly empty) list of atoms. Given lists x and y, 
we write x : y for the concatenation of x and y. The colon replaces the underscore ' _' of GP 1 for better 
readability. Also, the empty list empty is now allowed (not to be confused with the empty character 
string ""). When drawing graphs, we represent the empty list by omitting the word empty. (Confusion 
with unlabelled nodes is not possible as long as we consider graphs on the left or right of a rule schema, 
or host graphs. This is because these graphs are totally labelled.) 

We identify lists of length one with their contents and hence get the syntactic and semantic subtype
relationships shown in Figure 3. This is why we can form list expressions such as a : x and x : n in Figure
2, where x is a list, a an atom and n an integer. For the same reason, equations in the condition such as
a = 0 or a = "?" can compare expressions of arbitrary list subtypes.
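The identification of atoms with lists of length one can be mimicked in Python. This is a sketch under the assumption that list values are modelled as tuples of integers and strings; the helper names are illustrative, not GP syntax.

```python
EMPTY = ()   # the empty list; not the same value as the empty string ""


def concat(x, y):
    """x : y, the concatenation of two lists."""
    return x + y


def is_atom(x):
    return len(x) == 1


def is_int(x):
    return len(x) == 1 and isinstance(x[0], int)


def is_string(x):
    return len(x) == 1 and isinstance(x[0], str)


a, n, x = (0,), (3,), (1, 2)   # an atom, an integer and a list, all as tuples
print(concat(a, x))            # (0, 1, 2)
print(concat(x, n))            # (1, 2, 3)
print(is_atom(concat(a, x)), EMPTY == ("",))   # False False
```

Because every value is a tuple, an integer is also an atom and a list, which mirrors the subtype chain.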

Expressions in the left-hand side of a rule schema need no longer be constants or variables. Composite
expressions such as a : x in Figure 2 are allowed if there is no ambiguity in matching individual
variables with values in host graph labels. Similarly, the new dot operator '.' for string concatenation
can be used in left-hand labels. (The exact condition for left-hand expressions is given in Definition 2.)

The new functions indeg and outdeg access the indegree resp. outdegree of a left-hand node in the 
host graph. These operators may occur in the labels and the condition of a rule schema. Moreover, the 
binary edge predicate of GP 1 has now an optional third argument specifying the label of a possible edge 
Syntactic subtypes:  int, string $\subseteq$ atom $\subseteq$ list
Semantic subtypes:   $\mathbb{Z}$, $\mathrm{Char}^* \subseteq \mathbb{Z} \cup \mathrm{Char}^* \subseteq (\mathbb{Z} \cup \mathrm{Char}^*)^*$



Figure 3: Subtype hierarchy for lists



between the given nodes. For example, the subcondition not edge(1, 3, s.t) in Figure 2 demands that
there must not be an edge from node 1 to node 3 with label s.t (where the strings denoted by s and t
are determined by matching the left-hand graph).

Finally, GP 2 allows nodes and edges to be marked graphically. For example, the outermost nodes in
Figure 2 are marked by a grey shading, and the dashed arrow between nodes 1 and 3 in the right graph
represents a marked edge. Marking is formalised below by defining labels as pairs of lists and boolean
values, where a boolean value indicates whether a node or edge is marked or not.

Figure 4 and Figure 5 give grammars in Extended Backus-Naur Form defining the abstract syntax
of the labels and the condition of a rule schema. These grammars are ambiguous; in examples we use 
parentheses to disambiguate expressions if necessary. In the next section, the abstract syntax is used in 
defining the semantics of rule schemata. 



Integer ::= Digit {Digit} | IVariable | '-' Integer | Integer ArithOp Integer
          | (indeg | outdeg) '(' Node ')'
ArithOp ::= '+' | '-' | '*' | '/'
String  ::= '"' {Char} '"' | SVariable | String '.' String
Atom    ::= Integer | String | AVariable
List    ::= empty | Atom | LVariable | List ':' List
Label   ::= List Mark
Mark    ::= true | false



Figure 4: Abstract syntax of rule schema labels 



The grammar in Figure 4 defines four syntactic categories of expressions which can occur in a rule
schema: Integer, String, Atom and List, where Integer and String are subsets of Atom which in turn is
a subset of List. We assume that Node is the set of node identifiers occurring in the rule schema, which
must be the same for the left and the right graph ({1,2,3} in Figure 2). Moreover, IVariable, SVariable,
AVariable and LVariable are the sets of variables of type int, string, atom and list that occur in the
rule schema. These categories are disjoint since each variable must be declared with a unique type (see
Figure 2). The mark components of labels are represented graphically rather than textually.

The values of variables at execution time are determined by graph matching, hence we require that 
expressions in the left graph of a rule schema must have a simple shape. 



D. Plump 






Definition 2 (Simple list). An expression $e \in$ List is simple if

(1) e contains no arithmetic operators, 

(2) e contains at most one occurrence of a list variable, and 

(3) each occurrence of a string expression in e contains at most one occurrence of a string variable. 

For example, given the variable declarations of Figure 2, a : x and "no".s : y : t are simple expressions
whereas x : y and s.t are not simple.
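The three conditions of Definition 2 can be checked mechanically. The following sketch assumes a hypothetical encoding of expressions as nested tuples (`("var", name)`, `("str", s)`, `("int", n)`, `("dot", e1, e2)`, `("cons", e1, e2)`, `("arith", op, e1, e2)`) and a dictionary `types` recording the declared type of each variable. Unary minus and the degree functions are left out.

```python
def subexprs(e):
    """Yield e and all of its subexpressions."""
    yield e
    for child in e[1:]:
        if isinstance(child, tuple):
            yield from subexprs(child)


def is_simple(e, types):
    nodes = list(subexprs(e))
    # (1) no arithmetic operators
    if any(n[0] == "arith" for n in nodes):
        return False
    # (2) at most one occurrence of a list variable
    list_vars = [n for n in nodes if n[0] == "var" and types[n[1]] == "list"]
    if len(list_vars) > 1:
        return False
    # (3) each string expression contains at most one string variable
    for n in nodes:
        if n[0] == "dot":
            string_vars = [m for m in subexprs(n)
                           if m[0] == "var" and types[m[1]] == "string"]
            if len(string_vars) > 1:
                return False
    return True


# Variable declarations of the rule schema bridge.
types = {"s": "string", "t": "string", "a": "atom",
         "n": "int", "x": "list", "y": "list"}
var = lambda name: ("var", name)
a_x = ("cons", var("a"), var("x"))
no_s_y_t = ("cons", ("cons", ("dot", ("str", "no"), var("s")), var("y")), var("t"))
x_y = ("cons", var("x"), var("y"))
s_t = ("dot", var("s"), var("t"))
print([is_simple(e, types) for e in (a_x, no_s_y_t, x_y, s_t)])
# [True, True, False, False]
```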

The syntax of a rule schema condition is defined by the grammar in Figure 5. New features are the
predicates int, string and atom, which check whether an expression belongs to a subtype
of list, and equations between arbitrary list expressions. Also, the edge predicate can have a third
parameter specifying the list component of an edge label.

Condition ::= Type '(' List ')' | List ('=' | '!=') List
            | Integer RelOp Integer
            | edge '(' Node ',' Node [',' List] ')'
            | not Condition | Condition (and | or) Condition

Type      ::= int | string | atom

RelOp     ::= '>' | '>=' | '<' | '<='

Figure 5: Abstract syntax of rule-schema condition 

Definition 3 (Conditional rule schema). A rule schema $(L \leftarrow K \rightarrow R)$ consists of two inclusions $K \to L$
and $K \to R$ such that $L, R$ are graphs in $\mathcal{G}(\mathrm{Label})$ and $K$ consists of unlabelled nodes only. We require
that all list expressions in $L$ are simple and that all variables occurring in $R$ also occur in $L$. A conditional
rule schema $(L \leftarrow K \rightarrow R,\ c)$ consists of a rule schema $(L \leftarrow K \rightarrow R)$ and a condition $c \in$ Condition such
that all variables occurring in $c$ also occur in $L$.

When a conditional rule schema is declared, as in Figure 2, graph K is implicitly represented by the
node identifiers in L and R (which must coincide). Hence nodes without identifiers in L are to be deleted
and nodes without identifiers in R are to be created.

The requirement that all variables in R must also occur in L ensures that for a given match of L in a 
host graph, applying r produces a unique graph (up to isomorphism). Similarly, the evaluation of c has a 
unique result if all its variables occur in L. 



4 Semantics of Rule Schemata 

While the left and right graph of a rule schema are labelled with elements from the syntactic category
Label, host graphs are labelled with values from the following semantic domain

$\mathcal{L} = (\mathbb{Z} \cup \mathrm{Char}^*)^* \times \mathbb{B}$

where $\mathbb{B} = \{\mathrm{true}, \mathrm{false}\}$. Hence semantic labels are sequences consisting of integers and character
strings (we assume that Char is a fixed set of characters), paired with boolean values. As in the case of
syntactic labels, the individual elements of a
sequence are separated by colons, the empty sequence is represented by "white space", and the boolean
value true is represented graphically by shading resp. dashed lines.

The application of a rule schema $r$ with condition $c$ to a graph $G$ in $\mathcal{G}(\mathcal{L})$ proceeds roughly as
follows:

1. Match the left graph $L$ of $r$ with a subgraph of $G$, ignoring labels, by means of a premorphism
$g\colon L \to G$.

2. Check whether there is an assignment $\alpha$ of values to variables such that after evaluating the
expressions in $L$, $g$ is label-preserving.

3. Check whether the condition $c$ evaluates to true.

4. Apply the rule $r^{g,\alpha}$, obtained from $r$ by evaluating all expressions in the left and right graph, to $G$.

For example, Figure 6 shows an application of the rule schema bridge of Figure 2. The upper half
of the diagram represents the instantiation of bridge according to premorphism $g$ and the following
assignment $\alpha$: a $\mapsto$ 0, x $\mapsto$ 1:2, n $\mapsto$ 3, y $\mapsto$ 4, s $\mapsto$ "o", t $\mapsto$ "k". The lower half of the diagram
represents the application of the instance bridge$^{g,\alpha}$ according to $g$. Note that the application condition
of bridge (see Figure 2) is satisfied with respect to $g$ and $\alpha$.



Figure 6: An application of the rule schema bridge of Figure 2

In the remainder of this section, we make the above four steps precise. Consider a conditional rule
schema $r = (L \leftarrow K \rightarrow R,\ c)$. Given graphs $G$ in $\mathcal{G}(\mathrm{Label})$ and $H$ in $\mathcal{G}(\mathcal{L})$, a premorphism $g\colon G \to H$
consists of two functions $g_V\colon V_G \to V_H$ and $g_E\colon E_G \to E_H$ that preserve sources and targets:
$s_H \circ g_E = g_V \circ s_G$ and $t_H \circ g_E = g_V \circ t_G$.

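An assignment is a family of mappings $\alpha = (\alpha_X)_{X \in \{I,S,A,L\}}$ where $\alpha_I\colon \mathrm{IVariable} \to \mathbb{Z}$, $\alpha_S\colon \mathrm{SVariable} \to \mathrm{Char}^*$,
$\alpha_A\colon \mathrm{AVariable} \to \mathbb{Z} \cup \mathrm{Char}^*$ and $\alpha_L\colon \mathrm{LVariable} \to (\mathbb{Z} \cup \mathrm{Char}^*)^*$. We sometimes omit the subscripts of
these mappings as exactly one of them is applicable to a given variable.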

Given a premorphism $g\colon L \to G$, an assignment $\alpha$ and a label $l = e\,m$ with $e \in$ List and $m \in$
{true, false}, the value $l^{g,\alpha} \in \mathcal{L}$ is the pair $(e^{g,\alpha}, \mathrm{true})$ if $m$ = true and $(e^{g,\alpha}, \mathrm{false})$ otherwise.

The value $e^{g,\alpha} \in (\mathbb{Z} \cup \mathrm{Char}^*)^*$ is inductively defined. If $e$ = empty, then $e^{g,\alpha}$ is the empty sequence.
If $e$ has the form $d_1 \dots d_n$ ($n \geq 1$) with digits $d_1, \dots, d_n$, or has the form "$c_1 \dots c_n$" ($n \geq 0$) with characters
$c_1, \dots, c_n$, then $e^{g,\alpha}$ is the unique integer in $\mathbb{Z}$ resp. character string in $\mathrm{Char}^*$ represented by $e$. (Note
that the empty character string and empty$^{g,\alpha}$ are different values.) If $e$ is a variable, then $e^{g,\alpha} = \alpha(e)$.
Otherwise, $e^{g,\alpha}$ is obtained from the values of $e$'s components. If $e = -e_1$ with $e_1 \in$ Integer, then $e^{g,\alpha}$
is the integer opposite to $e_1^{g,\alpha}$. If $e$ has the form $e_1 \oplus e_2$ with $\oplus \in$ ArithOp and $e_1, e_2 \in$ Integer, then
$e^{g,\alpha} = e_1^{g,\alpha} \oplus_{\mathbb{Z}} e_2^{g,\alpha}$ where $\oplus_{\mathbb{Z}}$ is the integer operation represented by $\oplus$. If $e$ has the form indeg($n$) or
outdeg($n$), with $n \in$ Node, then $e^{g,\alpha}$ is the indegree resp. outdegree of the node $g_V(n)$ in $G$. Finally, if
$e = e_1.e_2$ with $e_1, e_2 \in$ String or $e = e_1 : e_2$ with $e_1, e_2 \in$ List, then $e^{g,\alpha}$ is the concatenation of $e_1^{g,\alpha}$ and
$e_2^{g,\alpha}$.

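The value $c^{g,\alpha} \in \mathbb{B}$ of the condition $c$ is also inductively defined. If $c$ has the form int($e_1$) with
$e_1 \in$ List, then $c^{g,\alpha}$ = true if and only if $e_1^{g,\alpha} \in \mathbb{Z}$. Similarly, if $c$ has the form string($e_1$) or atom($e_1$),
then $c^{g,\alpha}$ = true if and only if $e_1^{g,\alpha} \in \mathrm{Char}^*$ resp. $e_1^{g,\alpha} \in \mathbb{Z} \cup \mathrm{Char}^*$. If $c$ has the form $e_1 = e_2$ or $e_1$ != $e_2$
with $e_1, e_2 \in$ List, then $c^{g,\alpha}$ = true if and only if $e_1^{g,\alpha} = e_2^{g,\alpha}$ resp. $e_1^{g,\alpha} \neq e_2^{g,\alpha}$. If $c$ has the form $e_1 \bowtie e_2$
with $\bowtie\ \in$ RelOp and $e_1, e_2$ in Integer, then $c^{g,\alpha}$ = true if and only if $e_1^{g,\alpha} \bowtie_{\mathbb{Z}} e_2^{g,\alpha}$ where $\bowtie_{\mathbb{Z}}$ is the integer
relation represented by $\bowtie$.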

If $c$ has the form edge($m$, $n$) with $m, n \in$ Node, then $c^{g,\alpha}$ = true if and only if there is an edge in $G$
from $g_V(m)$ to $g_V(n)$. Similarly, if $c$ has the form edge($m$, $n$, $e$) with $m, n \in$ Node and $e \in$ List, then
$c^{g,\alpha}$ = true if and only if there is an edge from $g_V(m)$ to $g_V(n)$ with a label whose list component is $e^{g,\alpha}$.

If $c$ has the form not $c_1$ with $c_1 \in$ Condition, then $c^{g,\alpha}$ = true if and only if $c_1^{g,\alpha}$ = false. Finally, if $c$
has the form $c_1$ and $c_2$ with $c_1, c_2 \in$ Condition, then $c^{g,\alpha}$ = true if and only if $c_1^{g,\alpha}$ = true = $c_2^{g,\alpha}$, and if $c$
has the form $c_1$ or $c_2$, then $c^{g,\alpha}$ = true if and only if $c_1^{g,\alpha}$ = true or $c_2^{g,\alpha}$ = true.
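The inductive definitions of expression and condition values translate almost directly into a recursive evaluator. The sketch below is an illustration under several assumptions: expressions and conditions are encoded as nested tuples (a hypothetical encoding), list values are Python tuples, the host graph is given only by its edges as `(source, target, list_label)` triples, and `/` is read as floor division although the text leaves the details of division to the implementation.

```python
import operator

ARITH = {"+": operator.add, "-": operator.sub,
         "*": operator.mul, "/": operator.floordiv}   # rounding of '/' is an assumption
REL = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}
KINDS = {"int": (int,), "string": (str,), "atom": (int, str)}


def evaluate(cond, g, alpha, host_edges):
    """Truth value of a condition under premorphism g and assignment alpha."""
    def ev(e):                                  # value of a list expression
        tag = e[0]
        if tag == "empty":
            return ()
        if tag in ("int", "str"):
            return (e[1],)
        if tag == "var":
            value = alpha[e[1]]
            return value if isinstance(value, tuple) else (value,)
        if tag == "neg":
            return (-ev(e[1])[0],)
        if tag == "arith":
            return (ARITH[e[1]](ev(e[2])[0], ev(e[3])[0]),)
        if tag in ("outdeg", "indeg"):
            end = 0 if tag == "outdeg" else 1   # position of the node in an edge triple
            return (sum(1 for edge in host_edges if edge[end] == g[e[1]]),)
        if tag == "dot":                        # string concatenation
            return (ev(e[1])[0] + ev(e[2])[0],)
        if tag == "cons":                       # list concatenation
            return ev(e[1]) + ev(e[2])
        raise ValueError(tag)

    def holds(c):
        tag = c[0]
        if tag == "type":
            value = ev(c[2])
            return len(value) == 1 and isinstance(value[0], KINDS[c[1]])
        if tag in ("eq", "neq"):
            return (ev(c[1]) == ev(c[2])) == (tag == "eq")
        if tag == "rel":
            return REL[c[1]](ev(c[2])[0], ev(c[3])[0])
        if tag == "edge":
            wanted = ev(c[3]) if len(c) == 4 else None
            return any(s == g[c[1]] and t == g[c[2]]
                       and (wanted is None or lab == wanted)
                       for s, t, lab in host_edges)
        if tag == "not":
            return not holds(c[1])
        if tag == "and":
            return holds(c[1]) and holds(c[2])
        if tag == "or":
            return holds(c[1]) or holds(c[2])
        raise ValueError(tag)

    return holds(cond)


# Condition of bridge: (a = 0 or a = "?") and not edge(1, 3, s.t) and outdeg(1) = indeg(3)
var = lambda name: ("var", name)
bridge_condition = ("and",
                    ("and",
                     ("or", ("eq", var("a"), ("int", 0)), ("eq", var("a"), ("str", "?"))),
                     ("not", ("edge", 1, 3, ("dot", var("s"), var("t"))))),
                    ("eq", ("outdeg", 1), ("indeg", 3)))
alpha = {"a": 0, "s": "o", "t": "k"}
g = {1: "u", 2: "v", 3: "w"}
host_edges = [("u", "v", ()), ("v", "w", ())]   # hypothetical host graph
print(evaluate(bridge_condition, g, alpha, host_edges))   # True
```

Adding an edge from `u` to `w` labelled `("ok",)` to the hypothetical host graph makes the edge predicate true and hence the whole condition false.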

We call $r^{g,\alpha} = (L^{g,\alpha} \leftarrow K \rightarrow R^{g,\alpha})$ the instance of $r$ with respect to $g$ and $\alpha$, where $L^{g,\alpha}$ and $R^{g,\alpha}$ are
obtained from $L$ and $R$ by replacing each label $l$ with $l^{g,\alpha}$. Note that $r^{g,\alpha}$ is a graph transformation rule
over $\mathcal{L}$, in the sense of Section 2. We can now define the application of conditional rule schemata to
graphs in $\mathcal{G}(\mathcal{L})$.

Definition 4 (Rule-schema application). Given a conditional rule schema $r = (L \leftarrow K \rightarrow R,\ c)$ and graphs
$G, H$ in $\mathcal{G}(\mathcal{L})$, we write $G \Rightarrow_{r,g} H$ (or just $G \Rightarrow_r H$) if there are a premorphism $g\colon L \to G$ and an
assignment $\alpha$ such that

(1) $g$ is a graph morphism $L^{g,\alpha} \to G$,

(2) $c^{g,\alpha}$ = true, and

(3) $G \Rightarrow_{r^{g,\alpha},g} H$.

Here $G \Rightarrow_{r^{g,\alpha},g} H$ denotes the application of $r^{g,\alpha}$ with match $g$ to $G$, as defined in Section 2. Note that
we use $\Rightarrow$ for the application of both rule schemata and rules, to avoid an inflation of symbols. Given a
set $\mathcal{R}$ of conditional rule schemata, we write $G \Rightarrow_{\mathcal{R}} H$ if $G \Rightarrow_r H$ for some conditional rule schema $r$ in
$\mathcal{R}$.

(Footnote to the arithmetic operations above: the effect of dividing by zero is undefined, that is, left to the implementation.)
The following proposition shows that given a rule schema r, a premorphism from the left-hand graph
of r to G induces at most one instance of r that can be applied with match g.

Proposition. Given a conditional rule schema $r = (L \leftarrow K \rightarrow R,\ c)$ and a premorphism $g\colon L \to G$, there
exists at most one assignment $\alpha$ such that $g$ is a graph morphism $L^{g,\alpha} \to G$.

The proof of this property relies on the fact that the left-hand graph L contains only simple expres- 
sions. 
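To see why simplicity yields at most one assignment, consider matching a single left-hand label against a host label. The sketch below covers only a restricted case, stated as an assumption: the pattern is a flat sequence of constants and variables with at most one list variable and no string concatenation. The function `match_list` and its encoding are hypothetical. Since all other items consume exactly one element, the list variable must take whatever remains in its position.

```python
def match_list(pattern, value, types):
    """Match a flat pattern against a list value (a tuple of ints and strs).

    pattern: sequence of ("const", c) or ("var", name) items;
    types:   declared type of each variable ("int", "string", "atom" or "list").
    Returns the unique assignment as a dict, or None if there is no match.
    """
    positions = [i for i, p in enumerate(pattern)
                 if p[0] == "var" and types[p[1]] == "list"]
    assert len(positions) <= 1, "a simple expression has at most one list variable"
    if positions:
        i = positions[0]
        if len(value) < len(pattern) - 1:
            return None
        split = len(value) - (len(pattern) - i - 1)
        pairs = list(zip(pattern[:i], value[:i])) + list(zip(pattern[i + 1:], value[split:]))
        alpha = {pattern[i][1]: tuple(value[i:split])}   # forced by the other items
    else:
        if len(value) != len(pattern):
            return None
        pairs, alpha = list(zip(pattern, value)), {}
    for item, v in pairs:
        if item[0] == "const":
            if item[1] != v:
                return None
        else:
            name = item[1]
            fits = {"int": isinstance(v, int), "string": isinstance(v, str),
                    "atom": True}[types[name]]
            if not fits or alpha.get(name, v) != v:
                return None
            alpha[name] = v
    return alpha


types = {"a": "atom", "n": "int", "x": "list"}
print(match_list([("var", "a"), ("var", "x")], (0, 1, 2), types))   # {'x': (1, 2), 'a': 0}
print(match_list([("var", "x"), ("var", "n")], (1, 2, 3), types))   # {'x': (1, 2), 'n': 3}
print(match_list([("var", "x"), ("var", "n")], ("k",), types))      # None
```

With two list variables, as in x : y, the split point would no longer be determined, which is why such expressions are excluded from left-hand graphs.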



5 Programs 

The syntax of graph programs is the same as in GP 1, except for the syntax of rule schemata and the new
constructs try_then_else and or. Figure 7 shows the abstract syntax of GP 2 programs. As before, a
program consists of a number of declarations of conditional rule schemata and macros, and exactly one
declaration of a main command sequence. The identifiers of category RuleId occurring in a RuleSetCall
refer to declarations of conditional rule schemata in category RuleDecl (see previous sections).



Prog        ::= Decl {Decl}
Decl        ::= RuleDecl | MacroDecl | MainDecl
MacroDecl   ::= MacroId '=' ComSeq
MainDecl    ::= main '=' ComSeq
ComSeq      ::= Com {';' Com}
Com         ::= RuleSetCall | MacroCall
              | if ComSeq then ComSeq [else ComSeq]
              | try ComSeq then ComSeq [else ComSeq]
              | ComSeq '!'
              | ComSeq or ComSeq
              | skip | fail
RuleSetCall ::= RuleId | '{' [RuleId {',' RuleId}] '}'
MacroCall   ::= MacroId



Figure 7: Abstract syntax of programs 



In the next section it is shown that the commands or, skip and fail can be expressed through 
the other commands. Hence the core of GP includes only the call of a set of conditional rule schemata 
(RuleSetCall), sequential composition (';'), the if-then-else statement, the try-then-else statement and 
as-long-as-possible iteration ('!'). Before formally defining the semantics of programs, we discuss some 
example programs to illustrate the use of the new features of GP 2. 

Example 1 (Checking connectedness). A graph is connected if there is an undirected path between each
two nodes, that is, a sequence of consecutive edges whose directions don't matter. The program in Figure
8 checks whether an arbitrary input graph G is connected and, depending on the result, executes either
program P or program Q on G.

Connectedness is checked by picking some node, marking it, and propagating node marks along
edges as long as possible. Then an application of the rule schema unmarked tests whether any unmarked
nodes are left. If this is the case, then the macro disconnected succeeds and program Q is executed,
otherwise disconnected fails and program P is executed.

It is important to note that P or Q is executed on the input graph whereas the graph resulting from
the test is discarded. The precise semantics of the branching command is given in Section 6. □
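The marking strategy of the macro disconnected can be imitated outside GP. The following Python sketch is an analogue, not a translation: it assumes the graph is given as a list of nodes and a list of `(source, target)` pairs, and it picks the first node where GP would pick nondeterministically.

```python
def disconnected(nodes, edges):
    """Return True if some node is not reachable from the picked node,
    ignoring edge directions."""
    if not nodes:
        return False            # pick is not applicable, so the macro fails
    marked = {nodes[0]}         # pick: mark some node
    changed = True
    while changed:              # {grow1, grow2}!: propagate marks as long as possible
        changed = False
        for s, t in edges:
            if (s in marked) != (t in marked):
                marked |= {s, t}
                changed = True
    return any(v not in marked for v in nodes)   # unmarked: is a node left unmarked?


print(disconnected([1, 2, 3], [(1, 2), (3, 2)]))   # False
print(disconnected([1, 2, 3], [(1, 2)]))           # True
```

As in the GP program, the set of marks is auxiliary data that is thrown away once the test has been decided.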



main = if disconnected then Q else P
disconnected = pick; {grow1, grow2}!; unmarked

pick(x: list)
grow1(a,x,y: list)
grow2(a,x,y: list)
unmarked(x: list)

[The left and right graphs of the four rule schemata are not recoverable from the extracted text.]



Figure 8: A program for checking connectedness 

Example 2 (Recognising acyclic graphs). A graph is acyclic if it does not contain a directed cycle. The
program in Figure 9 checks whether an unmarked input graph G is acyclic and, depending on the result,
executes either program P or program Q on G.

The absence of cycles is checked by deleting, as long as possible, edges whose source nodes have no
incoming edges, and testing subsequently whether any edges remain. This method relies on the following
invariant of the rule schema delete: for every step $G \Rightarrow_{\mathtt{delete}} H$, G is acyclic if and only if H is acyclic.
Moreover, a graph to which delete is not applicable is acyclic if and only if it does not contain edges.
Note that the condition of delete uses the new indegree function. □
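The deletion strategy is easy to replay in Python. This sketch assumes the graph is given as a list of `(source, target)` pairs and always removes the first deletable edge, whereas GP chooses nondeterministically; the function name is illustrative.

```python
def is_acyclic(edges):
    """Delete edges whose source has indegree 0 as long as possible,
    then test whether any edge remains."""
    edges = list(edges)
    while True:
        targets = {t for _, t in edges}
        deletable = [e for e in edges if e[0] not in targets]   # indeg(source) = 0
        if not deletable:
            return not edges          # acyclic iff no edge is left
        edges.remove(deletable[0])


print(is_acyclic([(1, 2), (2, 3)]))           # True
print(is_acyclic([(1, 2), (2, 3), (3, 2)]))   # False
```

A loop is never deletable because its source is also its target, so graphs with loops are rejected.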

Example 3 (Recognising series-parallel graphs). Series-parallel graphs are inductively defined as fol- 
lows. Every graph G consisting of two nodes connected by an edge is series-parallel, where the edge's 
source and target are the source and target of G. Given series-parallel graphs G and H, the graphs
obtained from the disjoint union G + H by the following two operations are also series-parallel. Serial
composition: merge the target of G with the source of H; the source of G becomes the new source and 
the target of H becomes the new target. Parallel composition: merge the source of G with the source of 
H, and the target of G with the target of H; sources and targets are preserved. 
main = if acyclic then P else Q
acyclic = delete!; if {edge, loop} then fail

delete(a,x,y: list)
   where indeg(1) = 0
edge(a,x,y: list)
loop(a,x: list)

[The left and right graphs of the three rule schemata are not recoverable from the extracted text.]



Figure 9: A program for recognising acyclic graphs 

It is known fT, ^ that a graph is series-parallel if and only if it reduces to a graph consisting of 
two nodes connected by an edge (a base graph) by repeated application of the following operations: (a) 
Given a node with one incoming edge $i$ and one outgoing edge $o$ such that $s(i) \neq t(o)$, replace $i$, $o$ and
the node by an edge from $s(i)$ to $t(o)$. (b) Replace a pair of parallel edges by an edge from their source
to their target. 

Figure 10 shows a macro which reduces every unmarked series-parallel graph to the empty graph,
and fails on every other unmarked graph. The subprogram reduce! applies as long as possible the
operations (a) and (b) to the input graph G, then the rule schema delete-base checks if the result is a
base graph whose nodes are not incident to other edges. (The latter is ensured by the dangling condition.)
If delete-base is not applicable, then the input graph was not reduced to a base graph. In this case the
input graph is not series-parallel because every execution of reduce! yields the same graph. (This is
because the critical pairs of the rule schemata serial and parallel are strongly joinable [16].)

Finally, after delete-base has been applied, the rule schema nonempty checks whether the graph
resulting from reduce! contains nodes other than those of the base graph. The input graph is series-parallel
if and only if this is not the case. □
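The reduction can be replayed in Python as a rough analogue of the macro. Assumptions: the graph is a collection of nodes plus a list of `(source, target)` pairs, labels are ignored, and the final test combines delete-base and nonempty into one check that exactly a base graph is left.

```python
def is_series_parallel(nodes, edges):
    nodes, edges = set(nodes), list(edges)
    changed = True
    while changed:                      # reduce!
        changed = False
        # (b) parallel: replace a pair of parallel edges by a single edge
        for e in edges:
            if edges.count(e) > 1:
                edges.remove(e)
                changed = True
                break
        if changed:
            continue
        # (a) serial: bypass a node with one incoming edge i and one outgoing edge o
        for v in nodes:
            ins = [e for e in edges if e[1] == v]
            outs = [e for e in edges if e[0] == v]
            if len(ins) == 1 and len(outs) == 1 and ins[0][0] != outs[0][1]:
                edges.remove(ins[0])
                edges.remove(outs[0])
                edges.append((ins[0][0], outs[0][1]))
                nodes.remove(v)
                changed = True
                break
    # delete-base applies and nothing else is left
    return len(nodes) == 2 and len(edges) == 1 and edges[0][0] != edges[0][1]


diamond = [(1, 2), (2, 4), (1, 3), (3, 4)]
print(is_series_parallel({1, 2, 3, 4}, diamond))                  # True
print(is_series_parallel({1, 2, 3}, [(1, 2), (2, 3), (3, 1)]))    # False
```

The order in which the two operations are tried does not affect the answer, in line with the remark that every execution of reduce! yields the same graph.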

Example 4 (Computing Euler cycles). An Euler cycle is a directed cycle of distinct edges that contains
all edges and nodes of a graph. A graph is eulerian if it contains an Euler cycle. It is known that a graph
is eulerian if and only if it is connected and each node has the same indegree as outdegree [1]. Based on
this characterisation, the macro eulerian of Figure 11 checks whether an unmarked graph is eulerian
or not. It does this by using the macro disconnected of Figure 8 and the new indegree and outdegree
functions.
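The degree part of this characterisation is a one-line test. The sketch below covers only the balance check (connectedness has to be tested separately) and assumes the graph is given as a node list and a list of `(source, target)` pairs; the name `unbalanced` is borrowed from the rule schema of Figure 11.

```python
def unbalanced(nodes, edges):
    """Return the nodes whose indegree differs from their outdegree."""
    return [v for v in nodes
            if sum(s == v for s, _ in edges) != sum(t == v for _, t in edges)]


print(unbalanced([1, 2, 3], [(1, 2), (2, 3), (3, 1)]))   # []
print(unbalanced([1, 2], [(1, 2)]))                      # [1, 2]
```

A connected graph is eulerian exactly when this list is empty.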

Given an unmarked eulerian input graph with atomic labels, the program in Figure 12 computes an
Euler cycle and numbers its edges. An execution of this program is shown in Figure 13. In the resulting
series-parallel = reduce!; delete-base; if nonempty then fail
reduce = {serial, parallel}

serial(a,b,x,y,z: list)
parallel(a,b,x,y: list)
delete-base(a,x,y: list)
nonempty(x: list)

[The left and right graphs of the four rule schemata are not recoverable from the extracted text.]



Figure 10: A macro for recognising series-parallel graphs 

graph, the computed Euler cycle is given by the edges with the labels 1:1, 1:1:1, 1:1:2, 1:1:3, 1:2,
1:3 and 1:4. The graph in the middle of Figure 13 is an intermediate result representing the point in
time when the macro cycle has been executed for the first time.

The program uses the command try_then to check if the input graph is nonempty. If the input graph 
is empty, then the empty sequence of edges is an Euler cycle and hence the program returns the empty 
graph. If the input graph is nonempty, the rule schema init picks some node, adds to its label, and 
marks the node. Then the rule schema loop numbers all loops with atomic labels that are incident to the 
node. Next the rule schema cycle numbers a proper (that is, non-loop) cycle starting at this node, by 
repeatedly applying the rule schema grow. Also, at each visited node, loop is applied as long as possible 
to number all incident loops. 

When the first proper cycle has been numbered, the subprogram (next; cycle)! repeatedly computes
a new cycle starting at a node that has already been visited. This cycle is inserted into the current
cycle by numbering the new edges with lists that add one position to the list of the edge preceding the new 
edges. Finally, when all edges of the graph have been numbered, the rule schema clean-up removes all 
auxiliary information in node labels. □ 

6 Operational Semantics 

This section presents a formal semantics for GP 2 in the style of Plotkin's structural operational semantics
[15]. As usual for this approach, inference rules inductively define a small-step transition relation $\to$ on
eulerian = if disconnected then fail; if unbalanced then fail
disconnected = ...

unbalanced(x: list)
   where indeg(1) != outdeg(1)

[The left and right graph of the rule schema unbalanced are not recoverable from the extracted text.]

Figure 11: A macro for recognising eulerian graphs

configurations. In our setting, a configuration is either a command sequence together with a graph, just
a graph or the special element fail:

$\to\ \subseteq\ (\mathrm{ComSeq} \times \mathcal{G}(\mathcal{L})) \times ((\mathrm{ComSeq} \times \mathcal{G}(\mathcal{L})) \cup \mathcal{G}(\mathcal{L}) \cup \{\mathrm{fail}\})$.

Configurations in $\mathrm{ComSeq} \times \mathcal{G}(\mathcal{L})$, given by a rest program and a state in the form of a graph, represent
states of unfinished computations while graphs in $\mathcal{G}(\mathcal{L})$ are proper results. In addition, the element fail
represents a failure state. A configuration $\gamma$ is said to be terminal if there is no configuration $\delta$ such that
$\gamma \to \delta$.

Figure 14 in the Appendix shows the inference rules for the core commands of GP 2. Each rule
consists of a premise and a conclusion separated by a horizontal bar. Both parts contain meta-variables
for command sequences and graphs, where $R$ stands for a call in category RuleSetCall, $C, P, P', Q$ stand
for command sequences in category ComSeq, and $G, H$ stand for graphs in $\mathcal{G}(\mathcal{L})$. Meta-variables are
considered to be universally quantified. For example, the rule [call$_1$] reads: "For all $R$ in RuleSetCall and
all $G, H$ in $\mathcal{G}(\mathcal{L})$, $G \Rightarrow_R H$ implies $\langle R, G \rangle \to H$." The transitive and reflexive-transitive closures of $\to$
are written $\to^+$ and $\to^*$, respectively. The notation $G \not\Rightarrow_R$ expresses that for graph $G$ in $\mathcal{G}(\mathcal{L})$ there is
no graph $H$ such that $G \Rightarrow_R H$.

The if-then-else command has been designed to "hide" destructive tests. In Example 1, for instance,
the test of the if-then-else command produces a graph with marked nodes. By the inference rules [if$_1$] and
[if$_2$], this graph is discarded and program P or Q is executed on the input graph. In contrast, a program
try C then P else Q passes any graph resulting from its test to P. If test C fails, however, Q is executed
on the input graph.
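The difference between the two branching commands can be seen in a small deterministic model. This is a sketch under simplifying assumptions: commands are Python functions from a graph to either a graph or the hypothetical marker `FAIL`, nondeterminism and non-termination are ignored, and a graph is crudely represented by a frozenset.

```python
FAIL = "fail"   # hypothetical marker for a failed run


def if_then_else(C, P, Q):
    # The graph produced by the test C is discarded; P and Q see the input graph.
    return lambda G: Q(G) if C(G) == FAIL else P(G)


def try_then_else(C, P, Q):
    def run(G):
        H = C(G)
        # P continues on the result of the test, Q on the input graph.
        return Q(G) if H == FAIL else P(H)
    return run


mark = lambda G: G | {"marked"}      # a test that changes the graph
identity = lambda G: G
print(if_then_else(mark, identity, identity)(frozenset()))    # frozenset()
print(try_then_else(mark, identity, identity)(frozenset()))   # frozenset({'marked'})
```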

The semantics of the if-then-else command and the as-long-as-possible loop in GP 1 have been 
modified to allow an efficient implementation. Previously, the conditions of branching commands and 
the bodies of loops were tested, in the worst case, by trying all possible executions starting from the 
current graph. This made branching and loop commands impractical for complex tests or large input 
graphs. In GP 2, the semantics of if C then P else Q, try C then P else Q, and B! do not enforce
backtracking when C or B fails. Instead, control is passed to program Q or the loop is terminated,
respectively. Note that this change increases the nondeterminism of evaluation in cases where C or B can
both succeed and fail on the input graph.

The inference rules for the remaining GP commands are given in Figure 15 of the Appendix. These
commands are referred to as derived commands because they can be defined by the core commands, as 
shown below. 

The meaning of GP 2 programs is summarised by the semantic function $[\![\_]\!]$ which assigns to each
program $P$ the function $[\![P]\!]$ mapping an input graph $G$ to the set of all possible results of executing $P$
main = try init; loop! then (cycle; (next; cycle)!; clean-up!)
cycle = (grow; loop!)!; unmark
next = first; loop!

init(x: atom)
loop(a,x: atom; u: list; i: int)
grow(a,x,y: atom; i: int; u,v: list)
unmark(u: list)
first(a,x,y: atom; u,v: list)
   where u != empty
clean-up(x: atom; u: list)
   where u != empty

[The left and right graphs of the six rule schemata are not recoverable from the extracted text.]



Figure 12: A program for computing an Euler cycle 
Figure 13: An execution of the program of Figure 12

on $G$. The application of $[\![P]\!]$ to $G$ is written $[\![P]\!]G$. The result set may contain, besides proper results
in the form of graphs, the special values fail and $\bot$. The value fail indicates a failed program run while
$\bot$ indicates a run that does not terminate or gets stuck. Program $P$ can diverge from $G$ if there is an
infinite sequence $\langle P, G \rangle \to \langle P_1, G_1 \rangle \to \langle P_2, G_2 \rangle \to \dots$ Also, $P$ can get stuck from $G$ if there is a terminal
configuration $\langle Q, H \rangle$ such that $\langle P, G \rangle \to^* \langle Q, H \rangle$.

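Definition 5 (Semantic function). The semantic function $\llbracket\_\rrbracket : \mathrm{ComSeq} \to (\mathcal{G}(\mathcal{L}) \to 2^{\mathcal{G}(\mathcal{L}) \cup \{\mathrm{fail}, \bot\}})$ is 
defined by 

$$\llbracket P \rrbracket G = \{X \in (\mathcal{G}(\mathcal{L}) \cup \{\mathrm{fail}\}) \mid \langle P, G\rangle \to^{+} X\} \cup \{\bot \mid P \text{ can diverge or get stuck from } G\}.$$ 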

In the current implementation of GP, reaching the failure state triggers backtracking which then 
attempts to find a proper result |@]. However, backtracking can be switched off by the user. 

A program can get stuck in two situations: (1) it contains a command if C then P else Q or 
try C then P else Q such that C can diverge from some graph G and can neither produce a proper 
result from G nor fail from G, or (2) it contains a loop B! whose body B possesses the said property 
of C. The evaluation of such commands gets stuck because none of the inference rules for if-then-else, 
try-then-else or iteration is applicable. 
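The interplay of divergence, getting stuck and the value $\bot$ can be made concrete with a small Python sketch of the semantic function. It is an illustration under strong assumptions, not the GP 2 implementation: graphs are modelled as integers, a rule-set call is a Python function returning the list of possible result graphs (an empty list means the call fails), programs are nested tuples, and the set of reachable configurations is assumed to be finite. The names step, run, skip, fail_, dec and forever are hypothetical. The function step follows the inference rules for sequencing, branching and iteration given in the Appendix; run explores all transition sequences and reports $\bot$ (here BOTTOM) when it finds a cycle or a non-terminal configuration without successors.

```python
FAIL, BOTTOM = "fail", "bottom"  # special values; graphs are modelled as ints (assumption)

def step(prog, g):
    """One-step successors of the configuration (prog, g).

    A successor is a graph (int), FAIL, or a configuration (program, graph).
    """
    kind = prog[0]
    if kind == "rule":                      # rule-set call: apply it or fail
        results = prog[1](g)
        return set(results) if results else {FAIL}
    if kind == "seq":
        _, p, q = prog
        out = set()
        for nxt in step(p, g):
            if nxt == FAIL:
                out.add(FAIL)                              # [seq3]
            elif isinstance(nxt, tuple):
                out.add((("seq", nxt[0], q), nxt[1]))      # [seq1]
            else:
                out.add((q, nxt))                          # [seq2]
        return out
    if kind in ("if", "try"):
        _, c, p, q = prog
        out = set()
        for res in run(c, g):               # premise: outcomes of C on g
            if res == FAIL:
                out.add((q, g))                            # [if2], [try2]
            elif res != BOTTOM:
                out.add((p, g if kind == "if" else res))   # [if1], [try1]
        return out                          # empty if C can only diverge
    if kind == "loop":
        out = set()
        for res in run(prog[1], g):         # premise: outcomes of the body on g
            if res == FAIL:
                out.add(g)                                 # [alap2]
            elif res != BOTTOM:
                out.add((prog, res))                       # [alap1]
        return out
    raise ValueError(kind)

def run(prog, g):
    """Result set of prog on g, assuming finitely many reachable configurations."""
    results, seen, on_path = set(), set(), set()

    def visit(cfg):
        if cfg in on_path:                  # cycle: an infinite sequence exists
            results.add(BOTTOM)
            return
        if cfg in seen:
            return
        seen.add(cfg)
        on_path.add(cfg)
        succs = step(*cfg)
        if not succs:                       # non-terminal without successor: stuck
            results.add(BOTTOM)
        for s in succs:
            if isinstance(s, tuple):
                visit(s)
            else:
                results.add(s)
        on_path.discard(cfg)

    visit((prog, g))
    return results

skip = ("rule", lambda g: [g])                      # always applicable, graph unchanged
fail_ = ("rule", lambda g: [])                      # never applicable
dec = ("rule", lambda g: [g - 1] if g > 0 else [])  # toy rule: decrement
forever = ("loop", skip)                            # the body never fails

print(run(("loop", dec), 3))                # {0}
print(run(forever, 0))                      # {'bottom'}
print(run(("if", forever, skip, skip), 0))  # {'bottom'}
```

The last call shows situation (1): the condition forever can only diverge, so neither rule for if-then-else applies and the configuration is stuck.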

The semantic function of Definition 5 suggests a straightforward notion of program equivalence. 

Definition 6 (Semantic equivalence). Two programs P and Q are semantically equivalent, denoted by 
$P \equiv Q$, if $\llbracket P \rrbracket = \llbracket Q \rrbracket$. 

For example, it is easy to see that the following equivalences between derived commands and core 
commands hold (where $\emptyset$ is the empty graph): 

• skip $\equiv$ null, where null is the rule schema $\emptyset \Rightarrow \emptyset$; 

• fail $\equiv$ {}, where {} is the empty set of rule schemata; 

• if C then P $\equiv$ if C then P else null, for all programs C and P; 

• try C then P $\equiv$ try C then P else null, for all programs C and P. 

Less obvious is the following equivalence, showing that or is a derived command: 

P or Q $\equiv$ if remove!; {create, null}; zero then P else Q, 

for all programs P and Q. Here remove is a set of three rule schemata that delete arbitrary edges, loops 
and isolated nodes, create is the rule schema 



$\emptyset \Rightarrow$ [a single node; graphic not recovered] 

and zero is the rule schema [graphic not recovered]. 

The following non-equivalence may be surprising, too: 

try C then P else Q $\not\equiv$ if C then C; P else Q. 

To witness, choose C = skip or fail, P = skip and Q = skip. Then the try-program is equivalent to 
skip and hence cannot fail, but the if-program can fail. 
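The witness can be checked mechanically with a minimal Python sketch that computes result sets for loop-free programs built from skip, fail, or, sequencing, if-then-else and try-then-else. The tuple encoding of programs and the function name sem are assumptions made for this illustration, and the graph is an opaque value because none of these commands changes it.

```python
FAIL = "fail"  # special outcome; any other value stands for a graph

def sem(prog, g):
    """Set of outcomes (graphs or FAIL) of a loop-free program on graph g."""
    kind = prog[0]
    if kind == "skip":
        return {g}
    if kind == "fail":
        return {FAIL}
    if kind == "or":                        # nondeterministic choice
        return sem(prog[1], g) | sem(prog[2], g)
    if kind == "seq":
        out = set()
        for r in sem(prog[1], g):
            out |= {FAIL} if r == FAIL else sem(prog[2], r)
        return out
    if kind in ("if", "try"):
        _, c, p, q = prog
        out = set()
        for r in sem(c, g):
            if r == FAIL:
                out |= sem(q, g)            # condition failed: run Q on g
            else:
                # if: run P on the input graph; try: run P on the result of C
                out |= sem(p, g if kind == "if" else r)
        return out
    raise ValueError(kind)

C = ("or", ("skip",), ("fail",))
P = Q = ("skip",)
try_prog = ("try", C, P, Q)
if_prog = ("if", C, ("seq", C, P), Q)

print(sem(try_prog, "G"))            # {'G'}
print(FAIL in sem(if_prog, "G"))     # True
```

The if-program can fail because its then-part executes C a second time, and this second execution may choose fail even though the first one succeeded.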

7 Conclusion 

GP allows high-level problem solving in the domain of graphs, by supporting rule-based programming 
and freeing programmers from dealing with low-level data structures for graphs. The language has a 
simple syntax and semantics, to facilitate both understanding by programmers and formal reasoning on 
programs. 

The revised language GP 2 has an improved type system, including list variables and subtypes, a new 
concept of marking nodes and edges graphically, new built-in functions for accessing the indegree and 
the outdegree of nodes, a more powerful edge predicate for conditions, new commands try-then-else 
and or, and a simplified semantics of branching and looping to enable an efficient implementation. 

Topics for future work include the implementation of GP 2, tool support for Hoare-style program 
verification @, and static analyses for properties such as termination and confluence. 

Acknowledgements. Parts of this paper were written while visiting Annegret Habel in Oldenburg and 
Berthold Hoffmann in Bremen in the autumn of 2011. I am grateful for their hospitality. Thanks go also 
to the Plasma research group in York for helpful comments, especially to Colin Runciman for proposing 

Appendix: Semantic Inference Rules 



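$$[\mathrm{seq}_1]\quad \frac{\langle P, G\rangle \to \langle P', H\rangle}{\langle P; Q,\, G\rangle \to \langle P'; Q,\, H\rangle}$$

$$[\mathrm{seq}_2]\quad \frac{\langle P, G\rangle \to H}{\langle P; Q,\, G\rangle \to \langle Q, H\rangle}$$

$$[\mathrm{seq}_3]\quad \frac{\langle P, G\rangle \to \mathrm{fail}}{\langle P; Q,\, G\rangle \to \mathrm{fail}}$$

$$[\mathrm{if}_1]\quad \frac{\langle C, G\rangle \to^{+} H}{\langle \mathtt{if}\ C\ \mathtt{then}\ P\ \mathtt{else}\ Q,\, G\rangle \to \langle P, G\rangle}$$

$$[\mathrm{if}_2]\quad \frac{\langle C, G\rangle \to^{+} \mathrm{fail}}{\langle \mathtt{if}\ C\ \mathtt{then}\ P\ \mathtt{else}\ Q,\, G\rangle \to \langle Q, G\rangle}$$

$$[\mathrm{try}_1]\quad \frac{\langle C, G\rangle \to^{+} H}{\langle \mathtt{try}\ C\ \mathtt{then}\ P\ \mathtt{else}\ Q,\, G\rangle \to \langle P, H\rangle}$$

$$[\mathrm{try}_2]\quad \frac{\langle C, G\rangle \to^{+} \mathrm{fail}}{\langle \mathtt{try}\ C\ \mathtt{then}\ P\ \mathtt{else}\ Q,\, G\rangle \to \langle Q, G\rangle}$$

$$[\mathrm{alap}_1]\quad \frac{\langle P, G\rangle \to^{+} H}{\langle P!,\, G\rangle \to \langle P!,\, H\rangle}$$

$$[\mathrm{alap}_2]\quad \frac{\langle P, G\rangle \to^{+} \mathrm{fail}}{\langle P!,\, G\rangle \to G}$$

Figure 14: Inference rules for core commands 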



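$$[\mathrm{or}_1]\quad \langle P\ \mathtt{or}\ Q,\, G\rangle \to \langle P, G\rangle$$

$$[\mathrm{or}_2]\quad \langle P\ \mathtt{or}\ Q,\, G\rangle \to \langle Q, G\rangle$$

$$[\mathrm{skip}]\quad \langle \mathtt{skip},\, G\rangle \to G$$

$$[\mathrm{fail}]\quad \langle \mathtt{fail},\, G\rangle \to \mathrm{fail}$$

$$[\mathrm{if}_3]\quad \frac{\langle C, G\rangle \to^{+} H}{\langle \mathtt{if}\ C\ \mathtt{then}\ P,\, G\rangle \to \langle P, G\rangle}$$

$$[\mathrm{if}_4]\quad \frac{\langle C, G\rangle \to^{+} \mathrm{fail}}{\langle \mathtt{if}\ C\ \mathtt{then}\ P,\, G\rangle \to G}$$

$$[\mathrm{try}_3]\quad \frac{\langle C, G\rangle \to^{+} H}{\langle \mathtt{try}\ C\ \mathtt{then}\ P,\, G\rangle \to \langle P, H\rangle}$$

$$[\mathrm{try}_4]\quad \frac{\langle C, G\rangle \to^{+} \mathrm{fail}}{\langle \mathtt{try}\ C\ \mathtt{then}\ P,\, G\rangle \to G}$$

Figure 15: Inference rules for derived commands 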



